This report explores loan data from Prosper with 113937 observations and 81 data points extracted. As of the time of writing, brief descriptions of the data points can be found here.
## 'data.frame': 113937 obs. of 84 variables:
## $ ListingKey : Factor w/ 113066 levels "00003546482094282EF90E5",..: 7180 7193 6647 6669 6686 6689 6699 6706 6687 6687 ...
## $ ListingNumber : int 193129 1209647 81716 658116 909464 1074836 750899 768193 1023355 1023355 ...
## $ ListingCreationDate : POSIXct, format: "2007-08-26 19:09:29" "2014-02-27 08:28:07" ...
## $ CreditGrade : Ord.factor w/ 8 levels "NC"<"HR"<"E"<..: 5 NA 2 NA NA NA NA NA NA NA ...
## $ Term : Ord.factor w/ 3 levels "12"<"36"<"60": 2 2 2 2 2 3 2 2 2 2 ...
## $ LoanStatus : Factor w/ 12 levels "Cancelled","Chargedoff",..: 3 4 3 4 4 4 4 4 4 4 ...
## $ ClosedDate : POSIXct, format: "2009-08-14" NA ...
## $ BorrowerAPR : num 0.165 0.12 0.283 0.125 0.246 ...
## $ BorrowerRate : num 0.158 0.092 0.275 0.0974 0.2085 ...
## $ LenderYield : num 0.138 0.082 0.24 0.0874 0.1985 ...
## $ EstimatedEffectiveYield : num NA 0.0796 NA 0.0849 0.1832 ...
## $ EstimatedLoss : num NA 0.0249 NA 0.0249 0.0925 ...
## $ EstimatedReturn : num NA 0.0547 NA 0.06 0.0907 ...
## $ ProsperRating..numeric. : int NA 6 NA 6 3 5 2 4 7 7 ...
## $ ProsperRating..Alpha. : Ord.factor w/ 7 levels "HR"<"E"<"D"<"C"<..: NA 6 NA 6 3 5 2 4 7 7 ...
## $ ProsperScore : num NA 7 NA 9 4 10 2 4 9 11 ...
## $ ListingCategory..numeric. : int 0 2 0 16 2 1 1 2 7 7 ...
## $ BorrowerState : Factor w/ 52 levels "","AK","AL","AR",..: 7 7 12 12 25 34 18 6 16 16 ...
## $ Occupation : Factor w/ 68 levels "","Accountant/CPA",..: 37 43 37 52 21 43 50 29 24 24 ...
## $ EmploymentStatus : Factor w/ 9 levels "","Employed",..: 9 2 4 2 2 2 2 2 2 2 ...
## $ EmploymentStatusDuration : int 2 44 NA 113 44 82 172 103 269 269 ...
## $ IsBorrowerHomeowner : logi TRUE FALSE FALSE TRUE TRUE TRUE ...
## $ CurrentlyInGroup : logi TRUE FALSE TRUE FALSE FALSE FALSE ...
## $ GroupKey : Factor w/ 707 levels "","00343376901312423168731",..: 1 1 335 1 1 1 1 1 1 1 ...
## $ DateCreditPulled : POSIXct, format: "2007-08-26 18:41:46" "2014-02-27 08:28:14" ...
## $ CreditScoreRangeLower : int 640 680 480 800 680 740 680 700 820 820 ...
## $ CreditScoreRangeUpper : int 659 699 499 819 699 759 699 719 839 839 ...
## $ FirstRecordedCreditLine : POSIXct, format: "2001-10-11" "1996-03-18" ...
## $ CurrentCreditLines : int 5 14 NA 5 19 21 10 6 17 17 ...
## $ OpenCreditLines : int 4 14 NA 5 19 17 7 6 16 16 ...
## $ TotalCreditLinespast7years : int 12 29 3 29 49 49 20 10 32 32 ...
## $ OpenRevolvingAccounts : int 1 13 0 7 6 13 6 5 12 12 ...
## $ OpenRevolvingMonthlyPayment : num 24 389 0 115 220 1410 214 101 219 219 ...
## $ InquiriesLast6Months : int 3 3 0 0 1 0 0 3 1 1 ...
## $ TotalInquiries : num 3 5 1 1 9 2 0 16 6 6 ...
## $ CurrentDelinquencies : int 2 0 1 4 0 0 0 0 0 0 ...
## $ AmountDelinquent : num 472 0 NA 10056 0 ...
## $ DelinquenciesLast7Years : int 4 0 0 14 0 0 0 0 0 0 ...
## $ PublicRecordsLast10Years : int 0 1 0 0 0 0 0 1 0 0 ...
## $ PublicRecordsLast12Months : int 0 0 NA 0 0 0 0 0 0 0 ...
## $ RevolvingCreditBalance : num 0 3989 NA 1444 6193 ...
## $ BankcardUtilization : num 0 0.21 NA 0.04 0.81 0.39 0.72 0.13 0.11 0.11 ...
## $ AvailableBankcardCredit : num 1500 10266 NA 30754 695 ...
## $ TotalTrades : num 11 29 NA 26 39 47 16 10 29 29 ...
## $ TradesNeverDelinquent..percentage. : num 0.81 1 NA 0.76 0.95 1 0.68 0.8 1 1 ...
## $ TradesOpenedLast6Months : num 0 2 NA 0 2 0 0 0 1 1 ...
## $ DebtToIncomeRatio : num 0.17 0.18 0.06 0.15 0.26 0.36 0.27 0.24 0.25 0.25 ...
## $ IncomeRange : Ord.factor w/ 6 levels "$0"<"$1-24,999"<..: 3 4 NA 3 6 6 3 3 3 3 ...
## $ IncomeVerifiable : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
## $ StatedMonthlyIncome : num 3083 6125 2083 2875 9583 ...
## $ LoanKey : Factor w/ 113066 levels "00003683605746079487FF7",..: 100337 69837 46303 70776 71387 86505 91250 5425 908 908 ...
## $ TotalProsperLoans : int NA NA NA NA 1 NA NA NA NA NA ...
## $ TotalProsperPaymentsBilled : int NA NA NA NA 11 NA NA NA NA NA ...
## $ OnTimeProsperPayments : int NA NA NA NA 11 NA NA NA NA NA ...
## $ ProsperPaymentsLessThanOneMonthLate: int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPaymentsOneMonthPlusLate : int NA NA NA NA 0 NA NA NA NA NA ...
## $ ProsperPrincipalBorrowed : num NA NA NA NA 11000 NA NA NA NA NA ...
## $ ProsperPrincipalOutstanding : num NA NA NA NA 9948 ...
## $ ScorexChangeAtTimeOfListing : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanCurrentDaysDelinquent : int 0 0 0 0 0 0 0 0 0 0 ...
## $ LoanFirstDefaultedCycleNumber : int NA NA NA NA NA NA NA NA NA NA ...
## $ LoanMonthsSinceOrigination : int 78 0 86 16 6 3 11 10 3 3 ...
## $ LoanNumber : int 19141 134815 6466 77296 102670 123257 88353 90051 121268 121268 ...
## $ LoanOriginalAmount : int 9425 10000 3001 10000 15000 15000 3000 10000 10000 10000 ...
## $ LoanOriginationDate : POSIXct, format: "2007-09-12" "2014-03-03" ...
## $ LoanOriginationQuarter : Ord.factor w/ 33 levels "Q4 2005"<"Q1 2006"<..: 8 33 6 28 31 32 30 30 32 32 ...
## $ MemberKey : Factor w/ 90831 levels "00003397697413387CAF966",..: 11071 10302 33781 54939 19465 48037 60448 40951 26129 26129 ...
## $ MonthlyLoanPayment : num 330 319 123 321 564 ...
## $ LP_CustomerPayments : num 11396 0 4187 5143 2820 ...
## $ LP_CustomerPrincipalPayments : num 9425 0 3001 4091 1563 ...
## $ LP_InterestandFees : num 1971 0 1186 1052 1257 ...
## $ LP_ServiceFees : num -133.2 0 -24.2 -108 -60.3 ...
## $ LP_CollectionFees : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_GrossPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NetPrincipalLoss : num 0 0 0 0 0 0 0 0 0 0 ...
## $ LP_NonPrincipalRecoverypayments : num 0 0 0 0 0 0 0 0 0 0 ...
## $ PercentFunded : num 1 1 1 1 1 1 1 1 1 1 ...
## $ Recommendations : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsCount : int 0 0 0 0 0 0 0 0 0 0 ...
## $ InvestmentFromFriendsAmount : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Investors : int 258 1 41 158 20 1 1 1 1 1 ...
## $ ListingCategory : Factor w/ 20 levels "Debt Consolidation",..: NA 2 NA 16 2 1 1 2 7 7 ...
## $ CreditGrade.ProsperRating : Ord.factor w/ 8 levels "NC"<"HR"<"E"<..: 5 7 2 7 4 6 3 5 8 8 ...
## $ past.due.days : Ord.factor w/ 6 levels "Past Due (1-15 days)"<..: NA NA NA NA NA NA NA NA NA NA ...
## ListingKey ListingNumber
## 17A93590655669644DB4C06: 6 Min. : 4
## 349D3587495831350F0F648: 4 1st Qu.: 400919
## 47C1359638497431975670B: 4 Median : 600554
## 8474358854651984137201C: 4 Mean : 627886
## DE8535960513435199406CE: 4 3rd Qu.: 892634
## 04C13599434217079754AEE: 3 Max. :1255725
## (Other) :113912
## ListingCreationDate CreditGrade Term
## Min. :2005-11-09 20:44:28 C : 5649 12: 1614
## 1st Qu.:2008-09-19 10:02:14 D : 5153 36:87778
## Median :2012-06-16 12:37:19 B : 4389 60:24545
## Mean :2011-07-09 08:33:43 AA : 3509
## 3rd Qu.:2013-09-09 19:40:48 HR : 3508
## Max. :2014-03-10 12:20:53 (Other): 6745
## NA's :84984
## LoanStatus ClosedDate
## Current :56576 Min. :2005-11-25 00:00:00
## Completed :38074 1st Qu.:2009-07-14 00:00:00
## Chargedoff :11992 Median :2011-04-05 00:00:00
## Defaulted : 5018 Mean :2011-03-07 19:48:20
## Past Due (1-15 days) : 806 3rd Qu.:2013-01-30 00:00:00
## Past Due (31-60 days): 363 Max. :2014-03-10 00:00:00
## (Other) : 1108 NA's :58848
## BorrowerAPR BorrowerRate LenderYield
## Min. :0.00653 Min. :0.0000 Min. :-0.0100
## 1st Qu.:0.15629 1st Qu.:0.1340 1st Qu.: 0.1242
## Median :0.20976 Median :0.1840 Median : 0.1730
## Mean :0.21883 Mean :0.1928 Mean : 0.1827
## 3rd Qu.:0.28381 3rd Qu.:0.2500 3rd Qu.: 0.2400
## Max. :0.51229 Max. :0.4975 Max. : 0.4925
## NA's :25
## EstimatedEffectiveYield EstimatedLoss EstimatedReturn
## Min. :-0.183 Min. :0.005 Min. :-0.183
## 1st Qu.: 0.116 1st Qu.:0.042 1st Qu.: 0.074
## Median : 0.162 Median :0.072 Median : 0.092
## Mean : 0.169 Mean :0.080 Mean : 0.096
## 3rd Qu.: 0.224 3rd Qu.:0.112 3rd Qu.: 0.117
## Max. : 0.320 Max. :0.366 Max. : 0.284
## NA's :29084 NA's :29084 NA's :29084
## ProsperRating..numeric. ProsperRating..Alpha. ProsperScore
## Min. :1.000 C :18345 Min. : 1.00
## 1st Qu.:3.000 B :15581 1st Qu.: 4.00
## Median :4.000 A :14551 Median : 6.00
## Mean :4.072 D :14274 Mean : 5.95
## 3rd Qu.:5.000 E : 9795 3rd Qu.: 8.00
## Max. :7.000 (Other):12307 Max. :11.00
## NA's :29084 NA's :29084 NA's :29084
## ListingCategory..numeric. BorrowerState
## Min. : 0.000 CA :14717
## 1st Qu.: 1.000 TX : 6842
## Median : 1.000 NY : 6729
## Mean : 2.774 FL : 6720
## 3rd Qu.: 3.000 IL : 5921
## Max. :20.000 : 5515
## (Other):67493
## Occupation EmploymentStatus
## Other :28617 Employed :67322
## Professional :13628 Full-time :26355
## Computer Programmer : 4478 Self-employed: 6134
## Executive : 4311 Not available: 5347
## Teacher : 3759 Other : 3806
## Administrative Assistant: 3688 : 2255
## (Other) :55456 (Other) : 2718
## EmploymentStatusDuration IsBorrowerHomeowner CurrentlyInGroup
## Min. : 0.00 Mode :logical Mode :logical
## 1st Qu.: 26.00 FALSE:56459 FALSE:101218
## Median : 67.00 TRUE :57478 TRUE :12719
## Mean : 96.07
## 3rd Qu.:137.00
## Max. :755.00
## NA's :7625
## GroupKey DateCreditPulled
## :100596 Min. :2005-11-09 00:30:04
## 783C3371218786870A73D20: 1140 1st Qu.:2008-09-16 22:31:26
## 3D4D3366260257624AB272D: 916 Median :2012-06-17 08:01:23
## 6A3B336601725506917317E: 698 Mean :2011-07-09 16:14:50
## FEF83377364176536637E50: 611 3rd Qu.:2013-09-11 14:31:19
## C9643379247860156A00EC0: 342 Max. :2014-03-10 12:20:56
## (Other) : 9634 NA's :1
## CreditScoreRangeLower CreditScoreRangeUpper FirstRecordedCreditLine
## Min. : 0.0 Min. : 19.0 Min. :1947-08-24 00:00:00
## 1st Qu.:660.0 1st Qu.:679.0 1st Qu.:1990-06-01 00:00:00
## Median :680.0 Median :699.0 Median :1995-11-01 00:00:00
## Mean :685.6 Mean :704.6 Mean :1994-11-17 06:23:33
## 3rd Qu.:720.0 3rd Qu.:739.0 3rd Qu.:2000-03-14 00:00:00
## Max. :880.0 Max. :899.0 Max. :2012-12-22 00:00:00
## NA's :591 NA's :591 NA's :697
## CurrentCreditLines OpenCreditLines TotalCreditLinespast7years
## Min. : 0.00 Min. : 0.00 Min. : 2.00
## 1st Qu.: 7.00 1st Qu.: 6.00 1st Qu.: 17.00
## Median :10.00 Median : 9.00 Median : 25.00
## Mean :10.32 Mean : 9.26 Mean : 26.75
## 3rd Qu.:13.00 3rd Qu.:12.00 3rd Qu.: 35.00
## Max. :59.00 Max. :54.00 Max. :136.00
## NA's :7604 NA's :7604 NA's :697
## OpenRevolvingAccounts OpenRevolvingMonthlyPayment InquiriesLast6Months
## Min. : 0.00 Min. : 0.0 Min. : 0.000
## 1st Qu.: 4.00 1st Qu.: 114.0 1st Qu.: 0.000
## Median : 6.00 Median : 271.0 Median : 1.000
## Mean : 6.97 Mean : 398.3 Mean : 1.435
## 3rd Qu.: 9.00 3rd Qu.: 525.0 3rd Qu.: 2.000
## Max. :51.00 Max. :14985.0 Max. :105.000
## NA's :697
## TotalInquiries CurrentDelinquencies AmountDelinquent
## Min. : 0.000 Min. : 0.0000 Min. : 0.0
## 1st Qu.: 2.000 1st Qu.: 0.0000 1st Qu.: 0.0
## Median : 4.000 Median : 0.0000 Median : 0.0
## Mean : 5.584 Mean : 0.5921 Mean : 984.5
## 3rd Qu.: 7.000 3rd Qu.: 0.0000 3rd Qu.: 0.0
## Max. :379.000 Max. :83.0000 Max. :463881.0
## NA's :1159 NA's :697 NA's :7622
## DelinquenciesLast7Years PublicRecordsLast10Years
## Min. : 0.000 Min. : 0.0000
## 1st Qu.: 0.000 1st Qu.: 0.0000
## Median : 0.000 Median : 0.0000
## Mean : 4.155 Mean : 0.3126
## 3rd Qu.: 3.000 3rd Qu.: 0.0000
## Max. :99.000 Max. :38.0000
## NA's :990 NA's :697
## PublicRecordsLast12Months RevolvingCreditBalance BankcardUtilization
## Min. : 0.000 Min. : 0 Min. :0.000
## 1st Qu.: 0.000 1st Qu.: 3121 1st Qu.:0.310
## Median : 0.000 Median : 8549 Median :0.600
## Mean : 0.015 Mean : 17599 Mean :0.561
## 3rd Qu.: 0.000 3rd Qu.: 19521 3rd Qu.:0.840
## Max. :20.000 Max. :1435667 Max. :5.950
## NA's :7604 NA's :7604 NA's :7604
## AvailableBankcardCredit TotalTrades
## Min. : 0 Min. : 0.00
## 1st Qu.: 880 1st Qu.: 15.00
## Median : 4100 Median : 22.00
## Mean : 11210 Mean : 23.23
## 3rd Qu.: 13180 3rd Qu.: 30.00
## Max. :646285 Max. :126.00
## NA's :7544 NA's :7544
## TradesNeverDelinquent..percentage. TradesOpenedLast6Months
## Min. :0.000 Min. : 0.000
## 1st Qu.:0.820 1st Qu.: 0.000
## Median :0.940 Median : 0.000
## Mean :0.886 Mean : 0.802
## 3rd Qu.:1.000 3rd Qu.: 1.000
## Max. :1.000 Max. :20.000
## NA's :7544 NA's :7544
## DebtToIncomeRatio IncomeRange IncomeVerifiable
## Min. : 0.000 $0 : 621 Mode :logical
## 1st Qu.: 0.140 $1-24,999 : 7274 FALSE:8669
## Median : 0.220 $25,000-49,999:32192 TRUE :105268
## Mean : 0.276 $50,000-74,999:31050
## 3rd Qu.: 0.320 $75,000-99,999:16916
## Max. :10.010 $100,000+ :17337
## NA's :8554 NA's : 8547
## StatedMonthlyIncome LoanKey TotalProsperLoans
## Min. : 0 CB1B37030986463208432A1: 6 Min. :0.00
## 1st Qu.: 3200 2DEE3698211017519D7333F: 4 1st Qu.:1.00
## Median : 4667 9F4B37043517554537C364C: 4 Median :1.00
## Mean : 5608 D895370150591392337ED6D: 4 Mean :1.42
## 3rd Qu.: 6825 E6FB37073953690388BC56D: 4 3rd Qu.:2.00
## Max. :1750003 0D8F37036734373301ED419: 3 Max. :8.00
## (Other) :113912 NA's :91852
## TotalProsperPaymentsBilled OnTimeProsperPayments
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 9.00 1st Qu.: 9.00
## Median : 16.00 Median : 15.00
## Mean : 22.93 Mean : 22.27
## 3rd Qu.: 33.00 3rd Qu.: 32.00
## Max. :141.00 Max. :141.00
## NA's :91852 NA's :91852
## ProsperPaymentsLessThanOneMonthLate ProsperPaymentsOneMonthPlusLate
## Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.00
## Median : 0.00 Median : 0.00
## Mean : 0.61 Mean : 0.05
## 3rd Qu.: 0.00 3rd Qu.: 0.00
## Max. :42.00 Max. :21.00
## NA's :91852 NA's :91852
## ProsperPrincipalBorrowed ProsperPrincipalOutstanding
## Min. : 0 Min. : 0
## 1st Qu.: 3500 1st Qu.: 0
## Median : 6000 Median : 1627
## Mean : 8472 Mean : 2930
## 3rd Qu.:11000 3rd Qu.: 4127
## Max. :72499 Max. :23451
## NA's :91852 NA's :91852
## ScorexChangeAtTimeOfListing LoanCurrentDaysDelinquent
## Min. :-209.00 Min. : 0.0
## 1st Qu.: -35.00 1st Qu.: 0.0
## Median : -3.00 Median : 0.0
## Mean : -3.22 Mean : 152.8
## 3rd Qu.: 25.00 3rd Qu.: 0.0
## Max. : 286.00 Max. :2704.0
## NA's :95009
## LoanFirstDefaultedCycleNumber LoanMonthsSinceOrigination LoanNumber
## Min. : 0.00 Min. : 0.0 Min. : 1
## 1st Qu.: 9.00 1st Qu.: 6.0 1st Qu.: 37332
## Median :14.00 Median : 21.0 Median : 68599
## Mean :16.27 Mean : 31.9 Mean : 69444
## 3rd Qu.:22.00 3rd Qu.: 65.0 3rd Qu.:101901
## Max. :44.00 Max. :100.0 Max. :136486
## NA's :96985
## LoanOriginalAmount LoanOriginationDate LoanOriginationQuarter
## Min. : 1000 Min. :2005-11-15 00:00:00 Q4 2013:14450
## 1st Qu.: 4000 1st Qu.:2008-10-02 00:00:00 Q1 2014:12172
## Median : 6500 Median :2012-06-26 00:00:00 Q3 2013: 9180
## Mean : 8337 Mean :2011-07-21 03:44:57 Q2 2013: 7099
## 3rd Qu.:12000 3rd Qu.:2013-09-18 00:00:00 Q3 2012: 5632
## Max. :35000 Max. :2014-03-12 00:00:00 Q2 2012: 5061
## (Other):60343
## MemberKey MonthlyLoanPayment LP_CustomerPayments
## 63CA34120866140639431C9: 9 Min. : 0.0 Min. : -2.35
## 16083364744933457E57FB9: 8 1st Qu.: 131.6 1st Qu.: 1005.76
## 3A2F3380477699707C81385: 8 Median : 217.7 Median : 2583.83
## 4D9C3403302047712AD0CDD: 8 Mean : 272.5 Mean : 4183.08
## 739C338135235294782AE75: 8 3rd Qu.: 371.6 3rd Qu.: 5548.40
## 7E1733653050264822FAA3D: 8 Max. :2251.5 Max. :40702.39
## (Other) :113888
## LP_CustomerPrincipalPayments LP_InterestandFees LP_ServiceFees
## Min. : 0.0 Min. : -2.35 Min. :-664.87
## 1st Qu.: 500.9 1st Qu.: 274.87 1st Qu.: -73.18
## Median : 1587.5 Median : 700.84 Median : -34.44
## Mean : 3105.5 Mean : 1077.54 Mean : -54.73
## 3rd Qu.: 4000.0 3rd Qu.: 1458.54 3rd Qu.: -13.92
## Max. :35000.0 Max. :15617.03 Max. : 32.06
##
## LP_CollectionFees LP_GrossPrincipalLoss LP_NetPrincipalLoss
## Min. :-9274.75 Min. : -94.2 Min. : -954.5
## 1st Qu.: 0.00 1st Qu.: 0.0 1st Qu.: 0.0
## Median : 0.00 Median : 0.0 Median : 0.0
## Mean : -14.24 Mean : 700.4 Mean : 681.4
## 3rd Qu.: 0.00 3rd Qu.: 0.0 3rd Qu.: 0.0
## Max. : 0.00 Max. :25000.0 Max. :25000.0
##
## LP_NonPrincipalRecoverypayments PercentFunded Recommendations
## Min. : 0.00 Min. :0.7000 Min. : 0.00000
## 1st Qu.: 0.00 1st Qu.:1.0000 1st Qu.: 0.00000
## Median : 0.00 Median :1.0000 Median : 0.00000
## Mean : 25.14 Mean :0.9986 Mean : 0.04803
## 3rd Qu.: 0.00 3rd Qu.:1.0000 3rd Qu.: 0.00000
## Max. :21117.90 Max. :1.0125 Max. :39.00000
##
## InvestmentFromFriendsCount InvestmentFromFriendsAmount Investors
## Min. : 0.00000 Min. : 0.00 Min. : 1.00
## 1st Qu.: 0.00000 1st Qu.: 0.00 1st Qu.: 2.00
## Median : 0.00000 Median : 0.00 Median : 44.00
## Mean : 0.02346 Mean : 16.55 Mean : 80.48
## 3rd Qu.: 0.00000 3rd Qu.: 0.00 3rd Qu.: 115.00
## Max. :33.00000 Max. :25000.00 Max. :1189.00
##
## ListingCategory CreditGrade.ProsperRating
## Debt Consolidation:58308 C :23994
## Other :10494 B :19970
## Home Improvement : 7433 D :19427
## Business : 7189 A :17866
## Auto : 2572 E :13084
## (Other) :10976 (Other):19465
## NA's :16965 NA's : 131
## past.due.days
## Past Due (1-15 days) : 806
## Past Due (16-30 days) : 265
## Past Due (31-60 days) : 363
## Past Due (61-90 days) : 313
## Past Due (91-120 days): 304
## Past Due (>120 days) : 16
## NA's :111870
There’s a gap in listings from late 2008 to early 2009. It is probably related to some data points only having values until or from 2009.
Apart from a slowdown in early 2013, the number of listings has increased almost exponentially since 2009.
Round figures are very popular for original loan amounts.
The stated monthly income distribution is massively skewed, with some people claiming to earn over a million per month.
Borrower rate, APR and lender yield all have very similar values and follow very similar distributions.
Debt to income ratio is capped at about 10 (the maximum in the summary above is 10.01), giving a fat end to the tail of the distribution, but the majority of listings have much healthier ratios. Let’s zoom in a bit.
A very smooth skewed distribution here. Nothing unusual about it.
One year term loans are unexpectedly rare.
All listings have at least 1 investor. Very small investor counts form the tallest spike, but the summary above shows a median of 44, so many listings have far more.
Debt consolidation is by far the most popular listing category. Counts between categories vary greatly.
The most common occupation is "Other" (28617 listings), so a large share of borrowers do not fall under a specific predefined category.
Besides some loan listings actually declaring $0 income, there is nothing surprising in the distribution of income range.
Some listings are not fully funded. The distribution tail thickens going away from the main bulk of values. This is curious.
Small number of extreme values. Having a recommendation is very rare.
Short past due periods are more common than longer ones, probably because this catches some people who forgot to pay on time. Past 15 days, the distribution stays fairly flat.
A little over half of the listings are from homeowners.
It’s not typical to belong to a group.
It’s typical to have verifiable income.
The dataset has 113937 observations and 81 data points, plus 3 calculated data points (the last three columns in the structure listing above: ListingCategory, CreditGrade.ProsperRating and past.due.days). As of the time of writing, brief descriptions of the data points can be found here.
With so many data points, the dataset could be split into multiple slices, each telling something important about it. I’m sure I’ve missed some important features, but of the ones I looked at these were the most interesting: LoanOriginalAmount, Term, Investors, PercentFunded, Income, IsBorrowerHomeowner.
We see the same clusters of observations on round numbers. Dots stack into vertical bars, meaning that most listings fall on a small number of LoanOriginalAmount values.
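As a rough illustration of how two of the calculated data points could be derived, here is a minimal Python sketch. It assumes that CreditGrade.ProsperRating simply takes CreditGrade where it is present and ProsperRating..Alpha. otherwise, and that past.due.days keeps only the "Past Due" values of LoanStatus. The function names, the full level ordering and the sample rows are hypothetical and are not taken from the original analysis code.

```python
from typing import Optional

# Assumed ordering, worst to best; the structure listing only shows the first few levels.
GRADE_LEVELS = ["NC", "HR", "E", "D", "C", "B", "A", "AA"]


def combined_grade(credit_grade: Optional[str], prosper_rating: Optional[str]) -> Optional[str]:
    """Use CreditGrade when present, otherwise fall back to ProsperRating..Alpha."""
    grade = credit_grade if credit_grade is not None else prosper_rating
    if grade is not None and grade not in GRADE_LEVELS:
        raise ValueError(f"unknown grade: {grade!r}")
    return grade


def past_due_days(loan_status: str) -> Optional[str]:
    """Keep only the 'Past Due (...)' statuses; every other status becomes missing."""
    return loan_status if loan_status.startswith("Past Due") else None


# Hypothetical rows: (CreditGrade, ProsperRating..Alpha., LoanStatus)
rows = [
    ("C", None, "Completed"),
    (None, "A", "Current"),
    (None, "D", "Past Due (1-15 days)"),
]
derived = [(combined_grade(cg, pr), past_due_days(status)) for cg, pr, status in rows]
print(derived)
```

A fallback of this kind would be consistent with the structure listing, where CreditGrade.ProsperRating has far fewer missing values than either source column.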
## unique.LoanOriginalAmounts
## 2468
## all.listings
## 113937
## unique.LoanOriginalAmounts.divisible.by.500
## 68
## listings.with.LoanOriginalAmounts.divisible.by.500
## 102273
Most loan amounts are divisible by 500: 102273 of 113937 listings (about 90%), even though only 68 of the 2468 unique amounts are.
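The counts above can be reproduced along these lines. This is a minimal Python sketch, not the code used for the report; the function name is hypothetical, and the sample holds only the first ten LoanOriginalAmount values shown in the structure listing.

```python
def round_amount_summary(amounts, step=500):
    """Count unique amounts and listings whose amount is divisible by `step`."""
    unique = set(amounts)
    round_unique = {a for a in unique if a % step == 0}
    round_listings = sum(1 for a in amounts if a % step == 0)
    return {
        "unique_amounts": len(unique),
        "all_listings": len(amounts),
        "unique_amounts_divisible": len(round_unique),
        "listings_divisible": round_listings,
        "share_divisible": round_listings / len(amounts),
    }


# First ten LoanOriginalAmount values from the structure listing.
sample = [9425, 10000, 3001, 10000, 15000, 15000, 3000, 10000, 10000, 10000]
summary = round_amount_summary(sample)
print(summary)
```

Even in this tiny sample the pattern holds: 8 of 10 listings fall on just 3 round amounts.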
Investor count tends to increase as loan amount increases, but for most ranges small investor counts are more likely.
Clear diagonal lines mark loans with a small number of investors. We also have a lot of loans where loan amount per investor is very small, indicating that investing a small amount is popular across all loan amounts.
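One way to see why such diagonals appear, assuming the plot shows loan amount per investor against LoanOriginalAmount: for a fixed investor count n, every loan lies on the line y = x / n. The Python sketch below uses made-up loans and a hypothetical helper to show that all loans with the same investor count share one slope.

```python
def amount_per_investor(loan_amount, investors):
    """Average amount contributed per investor for one loan."""
    if investors < 1:
        raise ValueError("a funded loan has at least one investor")
    return loan_amount / investors


# Hypothetical loans: (LoanOriginalAmount, Investors)
loans = [(10000, 1), (15000, 1), (3000, 1), (10000, 2), (15000, 2)]

# Slope of each point relative to the origin, grouped by investor count.
slopes = {}
for amount, investors in loans:
    slope = amount_per_investor(amount, investors) / amount
    slopes.setdefault(investors, set()).add(slope)

print(slopes)
```

Each investor count collapses to a single slope (1 for one investor, 0.5 for two), which is what a family of diagonal lines would look like for small counts.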
We can see that loans over $25000 only started appearing in 2013.
Loans under $2000 stopped appearing in 2011.
The decrease in loan count in early 2013 is visible across all amounts.
Starting in 2013, a small number of investors per loan became much more popular.
The relationship between LoanOriginalAmount and Term looks as expected. Long-term loans tend to be bigger.
Listings that are not fully funded have a very curious relationship between LoanOriginalAmount and PercentFunded. Overall correlation is weak, but dots fall into diagonal lines.
The diagonal lines seem to arrive at round loan amount values as they approach full funding. These values are probably the amounts originally asked for.
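If that guess is right, the amount originally asked for can be estimated by dividing LoanOriginalAmount by PercentFunded. The Python sketch below illustrates the idea under the assumption that LoanOriginalAmount records the funded amount and PercentFunded the funded fraction of the request; the function name and the example values are hypothetical.

```python
def implied_requested_amount(loan_original_amount, percent_funded):
    """Estimate the requested amount, assuming funded = requested * percent_funded."""
    if percent_funded <= 0:
        raise ValueError("percent_funded must be positive")
    return loan_original_amount / percent_funded


# Hypothetical partially funded listings lying on one diagonal line.
examples = [(3500, 0.70), (4000, 0.80), (4500, 0.90), (5000, 1.00)]
implied = [round(implied_requested_amount(a, p)) for a, p in examples]
print(implied)
```

All four points map back to the same round request, which is how one requested amount would trace a straight line ending at a round value when PercentFunded reaches 1.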
LenderYield and PercentFunded are unrelated.
Investors and PercentFunded are not related either.
CreditGrade.ProsperRating and LoanOriginalAmount are related. LoanOriginalAmount increases until rating C and then flattens out.
Some listing categories tend to have much bigger loans than others. Debt Consolidation is surprisingly big. Baby&Adoption loans tend to be even bigger than Wedding Loans.
Not surprisingly, BorrowerAPR and BorrowerRate have a very strong relationship. What’s unexpected is that it neatly falls into a bunch of straight lines.
Homeowners tend to get bigger loans.
Most listing categories have fairly similar popularity between homeowners and non-homeowners. One exception is Home Improvement, with many more homeowners. This makes perfect sense.
Borrowers that are in groups tend to get smaller loans.
Borrowers with non-verifiable income tend to have slightly smaller loans, and this is probably the first sub-group I’ve looked at that doesn’t have loans over $25000.
Now those are some seriously extreme outliers. Let’s zoom in.
Borrowers with verifiable income have a tighter distribution of stated monthly income.
Looks like EstimatedReturn has little to do with ListingCategory. More popular categories have more outliers.
Credit grade clearly has a relationship with estimated return. The better the grade, the smaller and more neatly distributed the estimated return.
Homeowners seem to get larger loans. They also seem to be getting more loans listed in home improvements, business, debt consolidation and childcare categories. That tells a little about social status of homeowners.
Even though a single investor is the most common case regardless of loan amount, having a large number of investors (small contributions) seems to be popular regardless of loan amount as well.
Loan amount tends to be larger for better credit ratings up to C and stays similar for higher ratings. This could be due to an upper limit on the loan amounts visible in the dataset.
BorrowerRate and BorrowerAPR.
Even though loan amount per investor tends to be much greater for A, B and C ratings, centers of distributions are similar across credit ratings.
PercentFunded doesn’t depend on Investor count.
Investor count tends to be higher for best credit scores.
Throughout listing categories distribution centers of loan amount per investor look similar. Most categories have strongly skewed distributions.
Personal and student loans seem to have more investors per loan amount.
Regardless of credit grade homeowners tend to get larger loans. Both groups are capped at same amounts per credit grade though.
There are some visible caps for lower credit grades. Starting in 2013, some of them were increased.
Lower credit grades tend to get smaller loans and more widely distributed estimated returns.
As expected, borrower rate and estimated return seem related. Unexpectedly, negative estimated returns have a lot of high borrower rates in the mix.
Loan amount doesn’t seem related to borrower rate or estimated return, but their variability decreases as loan amount increases.
This is useless being so tiny, but we’re looking at 5 variables at once!
It’s still good enough to reassert some previous observations that were difficult to see in previous plots, such as the change of loan amount caps per credit grade over time. We also see that negative estimated returns were a temporary thing and disappeared starting in 2011, together with a lot of extreme estimated return values.
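A simple way to check such a claim numerically is to bin listings by loan amount and compare the spread within each bin. The Python sketch below uses the population standard deviation from the standard library; the function name, the bin width and the records are all made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import pstdev


def spread_by_amount_bin(records, bin_width=5000):
    """Standard deviation of a value within fixed-width loan amount bins."""
    bins = defaultdict(list)
    for amount, value in records:
        bins[(amount // bin_width) * bin_width].append(value)
    # Bins with a single value have no meaningful spread, so they are skipped.
    return {start: pstdev(values) for start, values in sorted(bins.items()) if len(values) > 1}


# Hypothetical (LoanOriginalAmount, BorrowerRate) pairs.
records = [
    (2000, 0.05), (3000, 0.30), (4000, 0.15),
    (12000, 0.18), (13000, 0.20), (14000, 0.22),
]
spread = spread_by_amount_bin(records)
print(spread)
```

A falling spread from the lowest bin to the highest would support the observation; on the real data the bin width would need some experimentation.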
Homeowners tend to get larger loans across credit grades. I was expecting this effect to diminish for higher credit grades, but that was not the case.
Variability of data points like estimated return decreases as loan amount increases.
Lower credit grade borrowers dominate loan amounts under 5000, but quickly disappear as amount increases.
All densities spike on loan amounts divisible by 5000.
Credit grade clearly has a relationship with estimated return. The better the grade, the smaller and more neatly distributed the estimated return.
Lower credit grades tend to get smaller loans and more widely distributed estimated returns.
Better credit grades concentrate at the lower end of estimated return with an exception of a bunch of negative estimated returns for credit grade HR.
This was a very challenging dataset to explore. The main struggle was the analysis-paralysis-inducing number of data points. It feels like my explorations barely scratched the surface.
It was interesting to see hints of lender policy changes in time series as well as tendency to borrow round amounts. These high density values created difficult overplotting problems.
What was unexpected is how the variability of many variables decreases as loan amount and listing creation date increase. This suggests that the lender has stricter rules for larger loans, and that rules got stricter between 2010 and 2011 in general.
It was a bit surprising to see such a visible relationship between credit grade and estimated return.
There are still plenty of data points to investigate in future work. When I started working on this dataset, I expected to find clues on how to predict how much could be borrowed based on borrower properties, but it doesn’t seem to contain rejected loans, so that is hardly possible, as people don’t always borrow the largest amount possible. This dataset would be good for building models though.